ABSTRACT

Students in two biostatistics courses at the Cornell 
Medical College and in a course in applications of computer science
given in Cornell's School of Industrial Engineering were given access 
to an interactive package of computer programs enabling them to 
perform statistical analysis without the burden of hand computation. 
After a general discussion of the possible educational impact of the 
package, a brief report is given of its use in the above mentioned 
courses at Cornell. (Author) 
Some Experience with Interactive
Computing in Teaching Introductory Statistics * 



by 



Carl Diegert 
Center for Environmental 
Quality Management, 
Cornell University 



Background 

The drudgery of hand calculating and memorizing formulas 
typically dominates the experience of students in statistics 
courses. It is unfortunate that the development of computational
skills has so often obscured conceptual understanding of statis- 
tical inference. With this in mind a group at Cornell prepared 
a package of interactive computer programs designed to substitute 
for some of the students' hand computation. 

The package does not in itself teach statistics, and the 
manual describing it offers no specific instruction in statis- 
tical analysis. Students are expected to have some familiarity 
with the analyses they will attempt, either from a text or course 
lectures. Using the package requires virtually no knowledge of 
computing.[1] The user's manual fully describes the techniques
required for communicating with the computer, and gives out-
lined steps and example solutions with comments. The inter-
active analyses are highly conversational and intended to be
self-explanatory.

* I am grateful to Professors Miké and Severance, and Ms.
Mushinski, for making available to me evaluations of the
courses described in this paper.

The package was written to be used within the framework 
of an already crowded course syllabus. However, it can be 
used effectively in large or small doses, depending upon the 
needs of the instructor. Students will benefit from dealing 
with large and complex problems, regardless of how many or 
how varied. Analysis of such problems can demonstrate the 
utility of formal statistical methods. By contrast, analysis 
of small problems need not have this result, for solutions 
here can be arrived at intuitively. 

Why use the computer package? 
The following seem to be the most important advantages 
of using the package. Depending on the extent to which it is 
used, it can play a small but useful role, or affect the en- 
tire curriculum and format of a statistics course. 

Assigned Problems 

1) A wide range of realistic and "relevant" problems 
can be assigned which, without the package, would require 
reduction of too much data, or unmanageably complex calcula- 
tions. In a large class one could, for example, test the 
hypothesis that birthdays are distributed uniformly over 
the 365 days of the year. 



2) Many problems can be assigned. In a survey course
an instructor can require students to use a variety of analyses,
without having to set aside class time to teach in detail the 
computational procedures involved. For example, the t-test for 
population mean could be described qualitatively as an extension 
of the test used when population variance is known. 

3) Multiple problems, each involving the same analysis, 
can also be assigned. In the test-of-means example, one could
"discover" that for a given sample, the confidence interval
estimation is more accurate when the population variance is 
known. Repeated analysis of the same data could also demon- 
strate the effect of class interval definition on a frequency 
table and its grouped statistics. 
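
The effect of class interval definition mentioned in item 3 can be sketched in a few lines. The Python below is not part of the Cornell package, which was written in APL. The function names `frequency_table` and `grouped_mean` and the five data values are hypothetical, chosen only to illustrate the point under the assumption of half-open intervals of equal width.

```python
import math
from collections import Counter


def frequency_table(data, start, width):
    """Count observations in half-open class intervals [lo, lo + width)."""
    # Index of the interval each observation falls into, counted from `start`.
    counts = Counter(math.floor((x - start) / width) for x in data)
    return {
        (start + k * width, start + (k + 1) * width): n
        for k, n in sorted(counts.items())
    }


def grouped_mean(table):
    """Mean computed from a frequency table, using interval midpoints."""
    total = sum(table.values())
    return sum((lo + hi) / 2 * n for (lo, hi), n in table.items()) / total


if __name__ == "__main__":
    scores = [1, 1, 1, 2, 9]  # hypothetical data, not taken from the paper
    for width in (2, 5):
        table = frequency_table(scores, start=0, width=width)
        print(width, table, grouped_mean(table))
```

For these five values the exact mean is 2.8, while the grouped mean is 3.0 with intervals of width 2 and 3.5 with intervals of width 5. Repeating the analysis with a different interval definition changes the grouped statistic even though the data are unchanged.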

Critical judgment 

The students are assured a computationally correct 
analysis. Therefore the instructor can place emphasis 
upon their: 1) selecting the method of analysis appropriate 
to the problem at hand; 2) obtaining the relevant data and 
entering them correctly; and 3) giving a proper interpreta- 
tion of the analysis results. For example, the problem might 
be posed of whether the males in the class demonstrated intel- 
lectual superiority over the females in their performance on 
the last examination. The students will then decide on an 
analysis and, perhaps, be asked to justify their selection. 
Should they choose a chi-square test of independence, their
results might look similar to Figure 1A. Using different
assumptions, they might choose to test for a significant
difference between sample means, and thus obtain results
similar to Figure 1B. (See Figure 1.)

Their interpretation of the output is crucial, and
will reveal the extent to which they understand the use of
the analysis. It will also show whether they grasp the
general and important point that different kinds of analyses
can be made of the same problem, and that they can yield
rather different results.
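
The two analyses described above can be sketched as follows. This is an illustration in Python, not the package's APL code. The function names, the pass mark of 60, and the score lists are all hypothetical. The paper does not say which form of the test of means the package used, so the unequal-variance (Welch) statistic here is an assumption.

```python
from math import sqrt
from statistics import mean, variance


def chi_square_independence(table):
    """Pearson chi-square statistic for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column classifications are independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def two_sample_t(x, y):
    """Difference of sample means divided by its estimated standard error."""
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se


if __name__ == "__main__":
    # Hypothetical examination scores: 60 males and 40 females.
    males = [55] * 20 + [70] * 25 + [85] * 15
    females = [55] * 16 + [70] * 18 + [85] * 6
    pass_mark = 60  # hypothetical cut-off used to form the 2 x 2 table
    table = [
        [sum(s >= pass_mark for s in group), sum(s < pass_mark for s in group)]
        for group in (males, females)
    ]
    print("chi-square statistic:", chi_square_independence(table))
    print("t statistic:", two_sample_t(males, females))
```

At the 0.05 level the chi-square statistic for a 2 x 2 table (1 degree of freedom) is compared with 3.841, and with samples of this size the t statistic can be compared with the two-sided normal value 1.96. The first analysis discards everything about a score except whether it passed, while the second uses the scores themselves, which is why the two can lead to different conclusions.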

Learning from experience 

Students using the package can learn from their 
mistakes, something that rarely occurs when they study 
worked problems. For example, suppose students are to 
learn the use of regression analysis in suggesting the 
various models which could explain the relation between 
observed variates. They cannot fully appreciate the use 
of this analysis until they themselves have formulated 
a model, and used the result of the regression calcula- 
tion under this model to formulate another. They will 
learn from this experience, because passing through 
these various stages involves a continual process of 
re-evaluating the data. 
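
The cycle of fitting a model, examining the result, and formulating another can be sketched as follows. This Python is an illustration only, not the package's regression routine. The helper names `fit_line` and `residuals` and the data are hypothetical, and only a single explanatory variable is assumed.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for the model y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Observed minus fitted values under y = intercept + slope*x."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


if __name__ == "__main__":
    # Hypothetical observations that happen to follow y = x**2 exactly.
    x = [0, 1, 2, 3, 4]
    y = [0, 1, 4, 9, 16]

    # First model: a straight line in x.
    a, b = fit_line(x, y)
    print("linear model residuals:", residuals(x, y, a, b))

    # The residuals are positive at both ends and negative in the middle,
    # which suggests curvature, so the second model regresses y on x**2.
    x_squared = [xi ** 2 for xi in x]
    a2, b2 = fit_line(x_squared, y)
    print("quadratic-term model residuals:", residuals(x_squared, y, a2, b2))
```

The point of the exercise is the second step: the pattern in the residuals from the first model, not the fitted coefficients alone, is what prompts the revised model.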

Beyond statistics 

The benefits of using the package extend beyond
the study of statistics: students are introduced to a
tool that has a growing influence in academic and everyday
life.

FIGURE ONE. Either of these analyses could be performed on the examination
scores of 60 males and 40 females to determine whether the males "demonstrated
their superiority". The analyses differ in their assumptions, and give different
conclusions using level of significance, p=0.05. In each analysis one begins
by entering the data. This entry is not displayed in the figure. Lines marked
with the symbol ^ were typed by the student; all others were typed by the
computer.

Classroom experience with the package 
The development in 1973 of the Cornell statistical 
programs [2] was supported by a National Fund for Medical
Education grant to develop computer-related innovations 
for the education of medical students. The package was 
used in two biostatistics courses given at the Cornell 
Medical College in 1973 [3] and in a course on applications
of computer science given at Cornell's School of Industrial 
Engineering in 1974. [4] 

The biostatistics courses 

In these courses students attended a 1/2 hour 
meeting in the room containing the computer terminals 
they would use. This meeting included a brief demonstra- 
tion, distribution of the user's manual, and information 
on logistics of computer access. Prior to the courses,
Professor Miké, her teaching assistant Ms. Mushinski,
and nearly all the students, had no exposure to the statis- 
tics package, APL language, or the College's interactive 
computing facilities. After the courses were over, Ms. 
Mushinski reported: 

Questions and problems were restricted to a 
few misinterpretations of the instructions 
in the manual and general confusion regard- 
ing the terminal keyboard. In every instance, 
the problems were easily corrected and the 
students continued to use the terminals. [5]

The package is intended to be a source of motivation,
not of distaste. Yet no matter how excellent the package, 
an unreliable computer, or inaccessible computer terminals, 
will drive away most students. The Medical College, using 
the SECOS Computer Timesharing System, [6] fortunately had 
neither problem. And despite the fact that the students
were busy and apprehensive, and their use of the package 
optional, their response was enthusiastic. Ms. Mushinski 
writes: 

Enthusiasm remained high as evidenced by the 
numbers of students who made use of the 
terminals and the APL Interactive Statistics 
Package as indicated below: 

— Of the 65 students in the required 
Introduction to Biostatistics course, 
approximately 50 or more than 3/4 of
the class participated in the experiment.
The Statistics Package was used with very 
little difficulty and primarily in rela- 
tion to the required homework problems. 
These students worked alone and in groups 
at the terminals. A few of the students 
indicated a desire to use the terminals 
on "more practical" or "more relevant" 
problems and they anticipated returning 
for such use later. 

— Of the students in the elective 
course, approximately 10 or roughly
one-third of the class used the APL
Package on a fairly regular basis. 
These students were very enthusiastic 
about having access to the terminals 
and used them both in relation to their 
jobs and the weekly homework assignments. 
It is anticipated that a number of the 
staff members and a few of the medical 
and graduate students will continue to 
make use of the manual and the system
as their needs require.

The applications of computer science course

Students in the course were assigned two problem sets
to be worked using the package. They were not given instruc-
tion in using the package nor any demonstrations. They
learned to use it solely from reading the manual and from 
trial and error. Although the teacher of the course, Pro- 
fessor Severance, and the computer personnel at the "public 
terminals" knew nothing about the package, they could and 
did assist the students in establishing connection to the 
computer. 

After eight weeks of the course, each student was
asked to write an evaluation of the package, along with
suggestions for how it might be improved. By and large,
the students thought well of their experience [7]:

The manual is, for the most part, quite 
clear. The abundance of examples serves 
to clarify many possible hazy points. 

Most questions of syntax can be dealt 
with during actual use. The immediate 
response of the system also aids in 
learning what the programs can do.... 

After comparing the length of time it
took to do the first problem set [by
hand] compared with the second [using
the package], no one in our group had
any complaints with the statistics
package.

The students raised some problems: 

It is very difficult for groups of 
5 to huddle around the APL terminal. 

The researcher who desires extremely 
flexible or specialized analysis is 
not going to be able to make extensive
use of the package. 
Only one problem mentioned by the students arose from the
medical nature of the package design:

...there is some type of p-value given and
no explanation of whether it's α, 1-α, or β.

The students' most common criticism was that the conversational
nature of the package, which made it easy for beginners to use
the computer, became a burden once beginners gained experience.
Many students suggested the same remedy for this. For example:

Perhaps Mr. Diegert could devise some system 
of making the amount of supportive information 
printed out by the computer depend on the ex-
perience of the user. Perhaps three categories
would be appropriate: EXPERIENCED, SHAKY, AND 
INEXPERIENCED. 

Students also suggested making the workspace bigger, tests of 
fit for more types of distributions, graphical display of 
regression output, covariance computation, and computation of 
more measures of a sample's central tendency. 
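
The three-category remedy quoted above could take a form like the following. This Python sketch is purely illustrative and does not describe any feature of the package. The `Experience` type, the `prompt_for` function, and the prompt wording are all hypothetical.

```python
from enum import Enum


class Experience(Enum):
    """The three user categories proposed in the student's suggestion."""
    EXPERIENCED = "experienced"
    SHAKY = "shaky"
    INEXPERIENCED = "inexperienced"


# Hypothetical prompt texts for one step of an analysis (asking for the
# sample size). The less experienced the user, the more explanation is shown.
SAMPLE_SIZE_PROMPTS = {
    Experience.EXPERIENCED: "N?",
    Experience.SHAKY: "ENTER THE SAMPLE SIZE:",
    Experience.INEXPERIENCED: (
        "ENTER THE SAMPLE SIZE, THAT IS, THE NUMBER OF OBSERVATIONS, "
        "THEN PRESS RETURN:"
    ),
}


def prompt_for(level):
    """Return the sample-size prompt suited to the user's stated experience."""
    return SAMPLE_SIZE_PROMPTS[level]


if __name__ == "__main__":
    for level in Experience:
        print(level.name, "->", prompt_for(level))
```

Keeping the prompt texts in one table per step, keyed by experience level, would let the conversational wording be shortened for practiced users without changing the analysis itself.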

Availability of the package 
To use the package one must establish an APL billing 
account with a timesharing computer system that facilitates 
use of the APL computer language. In the engineering course, 
a unique account number was issued to each student, thus 
making individuals accountable for the computer charges 
incurred. 

The package consists of statistical analysis procedures, 
viz., four "APL workspaces", that must be stored within the
computer system. Usually this is done by the machine opera- 
tors. These workspaces are normally stored in a "public 
library" (as in the SECOS system), making them available to
anyone with an APL account number using the system. Requests
to export these workspaces to other computer installations
may be made to:

Center for Environmental
Quality Management
468 Hollister Hall
Cornell University
Ithaca, N.Y. 14850

Copies of the user's manual are also available, at a cost of
one dollar per copy.

29 March 1974



FOOTNOTES 



[1] Existing APL statistical packages require that students
know the particular format of the input and output data for
each routine, and be reasonably fluent in APL language. Cf.
K.W. Smillie, "STATPACK 2: An APL Statistical Package," 2nd
ed., Publication #17, Department of Computer Science, Univer-
sity of Alberta, 1969; and J. Prins, "Statistical Programs
in APL", 4th ed., State University College, New Paltz, New
York, 1972. In many cases it is appropriate to ask students 
to meet these requirements. But the professor, the students, 
or both, are often not willing — and perhaps should not be 
expected — to learn to interact with the computer at this 
level of detail. 

[2] Carl Diegert, "An APL Interactive Statistics Package",
Center for Environmental Quality Management Reprint #1039, 
Cornell University, Ithaca, New York, 1973. 

[3] "Biostatistics" — two courses offered by the Department
of Public Health of the Cornell University Medical College,
New York City; second trimester, 1973; taught by Dr. Valerie
Miké.

[4] "Applications of Computer Science in Industrial
Engineering and Operations Research" — a course offered 
by the Department of Operations Research at Cornell Univer-
sity, Ithaca, New York; spring semester, 1974; taught by
Dr. Dennis Severance. 

[5] Correspondence from M.H. Mushinski, Senior Research
Assistant, Cornell Medical School, April 1973. 

[6] Shared Educational Computer System, 50 Market Street,
Poughkeepsie, New York. 

[7] Student evaluations courtesy of Dr. Severance.



